Supply Chain Yield Management based on Reinforcement Learning
Authors
Abstract
The paper presents a yield optimizing scheduling system (YOSS) in a decentralized supply chain management (DSCM) environment. For this purpose we employ a DSCM scenario in which each supplier provides system parts to the order of one or more customers. The supplier tries to optimize its return by searching for the optimal schedule for the single machine it uses to manufacture the requested products. The schedule is based on the supplier's knowledge of the delivery due dates, accomplishment priorities and job prices associated with the production tasks. The optimizer constructs a mandatory schedule by gradually inserting the requested jobs, which arrive stochastically from the customers, into the production queue, provided that a job yields a sufficient return. The tasks are finally executed on the supplier's machine following the fixed queue schedule. The objective of YOSS is to learn an optimal acceptance strategy for the offered jobs. For this purpose YOSS is divided into a deterministic scheduling component (DSC), which assigns the jobs to the processing queue according to delivery due date and timeout penalty cost, and a reinforcement learning algorithm (RLA), which makes the acceptance decision using the difference between job price and timeout penalty cost as its optimization criterion.

1 Learning Agents in Decentralized Supply Chain Optimization

Supply chain management (SCM) is concerned with the efficient allocation of production resources in a cross-plant environment. Besides the allocation aspect, the profit considerations of the individual supply chain partners and competitors play an important role in the attribution of tasks. It is therefore advisable from the economic point of view to employ the distributed decision structures of multi-agent systems (MAS) for the SCM task optimization and allocation process (Barbuceanu & Fox 1996) (Walsh & Wellman 1999).
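The split between the DSC and the acceptance decision described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Job` fields, the earliest-due-date ordering used as a stand-in for the DSC, the linear penalty per time unit of lateness, and the fixed `threshold` that takes the place of the policy the RLA would learn. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    price: float         # revenue if the job is accepted
    duration: float      # processing time on the single machine
    due: float           # delivery due date
    penalty_rate: float  # assumed timeout penalty per time unit of lateness


def schedule(jobs):
    """Hypothetical DSC stand-in: order the queue by earliest due date."""
    return sorted(jobs, key=lambda j: j.due)


def total_penalty(queue, start=0.0):
    """Sum of timeout penalties when the queue runs back to back from `start`."""
    t, cost = start, 0.0
    for job in queue:
        t += job.duration
        cost += job.penalty_rate * max(0.0, t - job.due)
    return cost


def marginal_return(queue, job):
    """Job price minus the extra timeout penalty caused by inserting the job."""
    new_queue = schedule(queue + [job])
    extra_penalty = total_penalty(new_queue) - total_penalty(queue)
    return job.price - extra_penalty, new_queue


def offer(queue, job, threshold=0.0):
    """Accept the job only if its marginal return exceeds `threshold`.

    The fixed threshold is a placeholder for a learned acceptance policy.
    """
    gain, new_queue = marginal_return(queue, job)
    return (True, new_queue) if gain > threshold else (False, queue)
```

In this sketch a job that pushes already accepted jobs past their due dates is charged for the penalties it causes, so a low-priced urgent job can be rejected even when the machine is otherwise free later on. A learning component would replace the fixed threshold with a decision that depends on the queue state and the expected future job arrivals.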
Taking the dependencies of the underlying production techniques into account, the SCM allocation problem presents itself as an algorithmic problem of combinatorial complexity (NP-hard). One way to deal with this problem is to create time-dependent price functions and to employ stochastic optimization methods (SOM) to attain a near-optimal allocation (Stockheim, Schwind, Wendt & Grolik 2002). Other approaches address the combinatorial allocation problem directly by using auctions, especially the combinatorial auction (CA), which is able to take account of the utility interdependencies of the negotiated goods and resource bundles. Early work on this topic was done by Wellman (1993), who formulated a model of decentralized markets and solved the allocation problem by means of a Walrasian tâtonnement process. One auctioneer per good adjusts the prices associated with the material flow according to the supply and the aggregate demand caused by the other SCM participants. The process is repeated sequentially for all goods until the market equilibrium is reached (Walsh, Wellman, Wurman & MacKie-Mason 1998). Pontrandolfo, Gosavi, Okogbaa & Das (2002) provide a global supply chain management (GSCM) approach. Based on a reinforcement learning algorithm (RLA) called SMART (Semi-Markov Average Reward Technique), RL-GSCM has to learn the optimal procurement and distribution policy for a supply system consisting of production plants, export warehouses, import warehouses and final markets in three countries. Calculating costs from currency exchange rates, tariffs, and production, inventory, late-delivery and transportation costs, the RLA chooses between three possible suppliers and one of two transportation modes. SMART is tested by applying various demand patterns to the supply chain, based on an Erlang probability distribution modified by various mean and deviation parameters. 
Compared with two heuristics, a local heuristic (LH) that prefers an inner-country production and distribution policy and a balanced heuristic (BH) that issues demand mainly to warehouses with a low capacity load, the SMART allocation mechanism provides the highest reward.
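The tâtonnement process mentioned above, in which one auctioneer per good adjusts prices according to supply and aggregate demand, can be sketched as follows. This is a generic illustration and not the model of Wellman (1993): the function name `tatonnement`, the proportional update rule, the `step` and `tol` parameters and the linear demand and supply functions in the example are all assumptions.

```python
def tatonnement(markets, prices, step=0.1, tol=1e-6, max_rounds=10_000):
    """Adjust prices good by good until every excess demand is below `tol`.

    `markets` maps each good to a (demand, supply) pair of functions that
    take the current price dictionary and return a quantity.
    """
    prices = dict(prices)  # work on a copy, leave the caller's dict untouched
    for _ in range(max_rounds):
        worst = 0.0
        for good, (demand, supply) in markets.items():
            excess = demand(prices) - supply(prices)
            # Raise the price under excess demand, lower it under excess supply.
            prices[good] = max(0.0, prices[good] + step * excess)
            worst = max(worst, abs(excess))
        if worst < tol:
            return prices
    raise RuntimeError("no equilibrium found within max_rounds")


# Hypothetical example with two independent goods and linear curves.
example_markets = {
    "steel": (lambda p: 10 - p["steel"], lambda p: p["steel"]),
    "parts": (lambda p: 12 - 2 * p["parts"], lambda p: p["parts"]),
}
equilibrium = tatonnement(example_markets, {"steel": 1.0, "parts": 1.0})
```

With these linear curves the update is a contraction for the chosen step size, so the loop settles near the prices at which demand equals supply. For interdependent goods or larger step sizes convergence is not guaranteed, which is why the sketch raises an error after `max_rounds`.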
Similar resources
Inventory management in supply chains: a reinforcement learning approach
A major issue in supply chain inventory management is the coordination of inventory policies adopted by different supply chain actors, such as suppliers, manufacturers, distributors, so as to smooth material flow and minimize costs while responsively meeting customer demand. This paper presents an approach to manage inventory decisions at all stages of the supply chain in an integrated manner. It...
Case-based reinforcement learning for dynamic inventory control in a multi-agent supply-chain system
Reinforcement learning (RL) appeals to many researchers in recent years because of its generality. It is an approach to machine intelligence that learns to achieve the given goal by trial-and-error iterations with its environment. This paper proposes a case-based reinforcement learning algorithm (CRL) for dynamic inventory control in a multi-agent supply-chain system. Traditional time-triggered...
Global Supply Chain Management: A Reinforcement Learning Approach
In recent years, researchers and practitioners alike have devoted a great deal of attention to supply chain management (SCM). The main focus of SCM is the need to integrate operations along the supply chain as part of an overall logistic support function. At the same time, the need for globalization requires that the solution of SCM problems be performed in an international context as part of w...
Conceptual Agent based Modeling in Supply Chain: An Economic Perspective
Abstract: The implementation of government legislation, social responsibility and environmental concerns regarding the reduction of waste, hazardous material and other consumer residuals has made the competition between firms stricter than ever, and nowadays firms that want to survive need a more productive and innovative approach toward the financial aspects of their businesses. This pape...
Meta Synthesis and Fuzzy Interpretive Structural Modeling of Talent Supply Chain Management in National Iranian Oil Company
The purpose of this paper was to present the talent supply chain management model of the National Iranian Oil Company. In terms of aim, this research is considered applied and developmental research based on a mixed method. The statistical population in the qualitative step consisted of articles in domestic and foreign scientific information bases related to talent management and in the quantitative step the...